Evidence for Growth of Microbial Genomes by Short Segmental Duplications

Authors

  • Li-Ching Hsieh
  • Liaofu Luo
  • H. C. Lee
Abstract

We show that textual analysis of microbial genomes reveals telling footprints of the early evolution of the genomes. If a random DNA sequence is treated as a text in its four nucleotides, the frequencies of word occurrence are expected to obey a Poisson distribution. We observe that, for words shorter than nine letters, the average width of the distributions for complete microbial genomes is many times that of a Poisson distribution. We interpret this phenomenon as follows: the genome is a large system that possesses the statistical characteristics of a much smaller “random” system, and certain textual statistical properties of the genomes we see now are remnants of those of their ancestral genomes, which were much shorter than present-day genomes. This interpretation suggests a simple, biologically plausible model for the growth of genomes: the genome first grows randomly to an initial length of approximately one thousand nucleotides (1k nt), or about one thousandth of its final length, and thereafter grows mainly by random segmental duplication. We show that, with duplicated segments averaging around 25 nt, the model sequences generated possess statistical properties characteristic of present-day genomes. Both the initial length and the duplicated segment length support an RNA world at the time duplication began. Random segmental duplication would greatly enhance the ability of a genome to reuse its hard-to-acquire codes, and a genome that practiced it would have evolved enormously faster than those that did not.
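The growth model described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the uniform distribution of segment lengths, the uniformly random insertion point, the absence of point mutations, and the scaled-down final length are all assumptions made for illustration. The sketch grows a random 1k nt seed by repeated segmental duplication and compares the width of the k-letter word frequency distribution with the Poisson expectation (standard deviation equal to the square root of the mean).

```python
import random
from collections import Counter
from itertools import product

ALPHABET = "ACGT"

def random_sequence(length, rng):
    """Return a uniformly random nucleotide string of the given length."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def grow_by_duplication(seed, final_length, mean_segment, rng):
    """Grow `seed` to `final_length` by copying random segments to random positions.

    Assumption: segment lengths are uniform on 1..2*mean_segment-1, so their
    mean is `mean_segment`; the abstract only gives the average (about 25 nt).
    """
    if not seed:
        raise ValueError("seed sequence must be non-empty")
    seq = list(seed)
    while len(seq) < final_length:
        seg_len = rng.randint(1, 2 * mean_segment - 1)
        # Never copy more than exists or overshoot the target length.
        seg_len = min(seg_len, len(seq), final_length - len(seq))
        start = rng.randrange(len(seq) - seg_len + 1)
        segment = seq[start:start + seg_len]
        insert_at = rng.randrange(len(seq) + 1)
        seq[insert_at:insert_at] = segment
    return "".join(seq)

def width_ratio(seq, k):
    """Std. dev. of k-letter word counts divided by the Poisson width sqrt(mean)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Include words that never occur, so all 4**k words are represented.
    freqs = [counts.get("".join(w), 0) for w in product(ALPHABET, repeat=k)]
    mean = sum(freqs) / len(freqs)
    var = sum((f - mean) ** 2 for f in freqs) / len(freqs)
    return (var ** 0.5) / (mean ** 0.5)

if __name__ == "__main__":
    rng = random.Random(0)
    seed = random_sequence(1000, rng)             # initial random growth to 1k nt
    grown = grow_by_duplication(seed, 100_000, 25, rng)
    control = random_sequence(100_000, rng)       # fully random sequence, same length
    for k in (2, 3, 4):
        print(k, round(width_ratio(control, k), 2), round(width_ratio(grown, k), 2))
```

For the fully random control the ratio should stay close to 1, while the sequence grown by duplication should give a clearly larger ratio. The final length here is 100 times the seed rather than the roughly 1000 times stated in the abstract, only to keep the run short.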

Similar resources

Short Segmental Duplication: Parsimony in the Growth of Microbial Genomes

We show that textual analysis of microbial complete genomes reveals telling footprints of their early evolution. If a DNA sequence considered as a text in its four bases is sufficiently random, the distribution of frequencies of words of a fixed length from the text should be Poissonian. We point out that in reality, for words shorter than nine letters, complete microbial genomes universally have...

Growth of microbial genomes by short segmental duplications

A DNA sequence can be analyzed as a text of four letters by counting the times each word in the set of k-letter words occurs in the text. If the text is random and long enough, then the frequencies of word occurrence are expected to obey a Poisson distribution. Examination of complete microbial genomes shows that for k less than 9, the distribution has a width many times the width of a Poisson ...

Short Segmental Duplication: Model for Growth of Microbial Genomes

We show that textual analysis of microbial complete genomes reveals telling footprints of their early evolution. If a DNA sequence considered as a text in its four bases is sufficiently random, the distribution of frequencies of words of a fixed length from the text should be Poissonian. We point out that in reality, for words shorter than nine letters, complete microbial genomes universally have d...

Universal Lengths in Microbial Genomes and Implication for Early Genome Growth

We report the discovery of a set of universal lengths that characterize all microbial complete genomes. The Shannon information [Shannon 1948] of 108 complete microbial genomes relative to that of their respective randomized counterparts is computed, and the results are summarized in a two-parameter exponential relation: L_r(k) = (42 ± 21) × 2.64^k, 2 ≤ k ≤ 10, where L_r is a "root-sequence length" ...

Minimal model for genome evolution and growth.

Textual analysis of typical microbial genomes reveals that they have the statistical characteristics of a DNA sequence of a much shorter length. This peculiar property supports an evolutionary model in which a genome evolves by random mutation but primarily grows by random segmental duplication. That genomes grew mostly by duplication is consistent with the observation that repeat sequences in ...

Publication date: 2003